BOOKKEEPER-968 Entry log flushes at configurable chunks #77
Conversation
|
The failed test seems to be unrelated; it looks like a flapper. |
|
@dlg99 Have you considered the approach of using a rate limiter (eg: guava |
|
On the other hand, we might even be able to do some auto-tuning: if I was only able to write, say, 200 MB in 1 second without throttling, next time I'll try to throttle at 190 MB/s so as not to saturate the disk. |
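A minimal sketch of the auto-tuning idea above, assuming the throttle simply keeps a fixed percentage of the last unthrottled throughput it observed. The class `AutoTunedThrottle` and its methods are hypothetical names used for illustration only; nothing here is BookKeeper or Guava API.

```java
/** Hypothetical helper: derives a write-rate limit from an observed unthrottled rate. */
final class AutoTunedThrottle {
    // Percentage of the observed throughput to allow next time (assumed knob, e.g. 95).
    private final int headroomPercent;
    // Current limit in bytes per second; Long.MAX_VALUE means "not throttled yet".
    private long limitBytesPerSec = Long.MAX_VALUE;

    AutoTunedThrottle(int headroomPercent) {
        if (headroomPercent <= 0 || headroomPercent > 100) {
            throw new IllegalArgumentException("headroomPercent must be in (0, 100]");
        }
        this.headroomPercent = headroomPercent;
    }

    /** Records how many bytes were written in one second without throttling. */
    void recordUnthrottledRate(long observedBytesPerSec) {
        // Stay slightly below what the disk managed, so it is not saturated.
        limitBytesPerSec = observedBytesPerSec / 100 * headroomPercent;
    }

    long limitBytesPerSec() {
        return limitBytesPerSec;
    }
}
```

With a 95% headroom, an observed rate of 200 MB/s (taken here as 200,000,000 bytes per second) yields a limit of 190 MB/s, matching the numbers in the comment.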
|
@merlimat limiting the write rate won't help. One can write at a relatively low rate, but since the entry log only gets fileChannel.force() on log rotation, we end up with a lot of writes cached in memory. One can experiment with the Linux config (vm.dirty_bytes / vm.dirty_background_bytes, IIRC), but these are OS-wide settings (not per disk) and will affect other writes, i.e. I would not want to decrease these parameters to 10-50M for the rotational disk where we write application logs. Limiting these to the range of hundreds of MB does not help much with the specific problem at hand. |
|
@sijie, should we add items to doc/bookkeeperConfigParams.textile along with this change? In this link: But it seems this textile file contains only the tip of the iceberg; a lot of config parameters are not included. If needed, we may need a ticket to fix this. |
merlimat
left a comment
Change looks good to me.
|
+1 on what @jiazhai suggested. We should document all the new configurations. |
|
@jvrao yup, another JIRA is fine. We need some JIRAs for documenting the settings introduced in 4.5.0, and we should mark them as blockers for the 4.5.0 release. |
|
+1 for this change |
* changed all instances of `assert()` to junit versions in `src/test`
Here is the script I used to find all instances of `assert()` inside the `src/test` folder:
```
grep -r "assert(" * 2>/dev/null | grep -v "main"
```
The `grep -v "main"` removes all instances of the usage within the main source tree (of which there are quite a few). I did this because I assumed the spirit of the JIRA ticket was not to remove those instances from the main tree, which would require including `junit` in core compilation rather than scoping it to `test` as it is now.
Author: Brennon York <brennon.york@capitalone.com>
Reviewers: Sijie Guo <sijie@apache.org>
Closes apache#77 from brennonyork/DL-122
* (@bug W-4698540@) Increase openLedgerRereplicationGracePeriod
openLedgerRereplicationGracePeriod defines how long the replication worker waits to fence an active extent before starting replication. The default value is 30 sec, which should be enough for the writer to detect the bookie-down incident and make an ensemble change. But given that we don't have back-pressure, clients and bookies sometimes get backed up and may take longer to detect the failure and change the ensemble. Changing this value to 10 mins. Making this longer should not have any major impact, as the writer is confirmed to everything it wrote. Also, we now have explicitLAC, so even delaying the fencing won't stop or restrict a reader from reading even the last entry/fragment.
Signed-off-by: Venkateswararao Jujjuri (JV) <vjujjuri@salesforce.com>
* Applied review comments
With this patch, one can configure the interval (in bytes) at which the entry log flushes its writes to disk.
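To illustrate the idea, here is a minimal sketch of a writer that calls `FileChannel.force()` once a configured number of bytes has accumulated, instead of waiting for log rotation. It is not the code from this patch: the class `ChunkedFlushWriter` and the parameter name `flushIntervalBytes` are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical writer that forces data to disk every flushIntervalBytes bytes. */
final class ChunkedFlushWriter implements AutoCloseable {
    private final FileChannel channel;
    // Assumed knob: force after this many unflushed bytes; 0 disables chunked flushing.
    private final long flushIntervalBytes;
    private long unflushedBytes = 0;
    private long forceCount = 0;

    ChunkedFlushWriter(Path file, long flushIntervalBytes) throws IOException {
        this.channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND);
        this.flushIntervalBytes = flushIntervalBytes;
    }

    /** Appends the buffer, then forces the channel if enough bytes have piled up. */
    void write(ByteBuffer data) throws IOException {
        while (data.hasRemaining()) {
            unflushedBytes += channel.write(data);
        }
        if (flushIntervalBytes > 0 && unflushedBytes >= flushIntervalBytes) {
            flush();
        }
    }

    /** Pushes cached writes to the storage device (file content only, no metadata). */
    void flush() throws IOException {
        channel.force(false);
        unflushedBytes = 0;
        forceCount++;
    }

    long forceCount() {
        return forceCount;
    }

    long unflushedBytes() {
        return unflushedBytes;
    }

    @Override
    public void close() throws IOException {
        flush(); // do not leave a dirty tail behind
        channel.close();
    }
}
```

With a 1024-byte interval, two 600-byte writes trigger a single `force()` right after the second write, so the amount of dirty data cached per log stays bounded by roughly the interval plus one write.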